Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring, arg list handling, add two parameters #11

Merged
merged 18 commits into from
Mar 1, 2024

Conversation

StevenCannon-USDA
Copy link
Contributor

Many changes in this branch.

  • The largest number of code changes involve moving the calculations of alignments, trees, and HMMs into pandagma-common.sh.
  • I added two parameters to help avoid unnecessary calculations of alignments, trees, and HMMs on very small families.
  • I also made some fixes in pandagma-pan.sh to deal with "argument list too long" - which I observed when calculating pangenes for 57 Glycine annotations. (ARG_MAX is 2097152 on both Ceres and NERSC/Perlmutter, but I saw this error only when running on Perlmutter). My fix was simpleminded, using pushd/popd to remove repeated inclusion of directory names. There might be better solutions (xargs or the like), but I think my fix should be OK for the time-being.

I have tested the changes on two large (and pretty costly) production runs: pan for 57 Glycine annotations and fam on 25 species.

@nathanweeks
Copy link
Contributor

nathanweeks commented Feb 28, 2024

It looks like existing conf files need to be updated; thepandagma pan GitHub Action workflow jobs are failing with errors like:

2024-02-28T11:28:44.5818665Z /home/runner/work/pandagma/pandagma/bin/pandagma-common.sh: line 81: min_align_count: unbound variable
2024-02-28T11:28:44.5838371Z ERROR conda.cli.main_run:execute(124): `conda run pandagma pan -d data_TEfilter -c config/Arachis3_5_2_0.conf -w /work -r -o upload-artifact/pandagma-pan-Arachis3_5_2_0` failed. (See above for error)

https://github.com/legumeinfo/pandagma/actions/runs/8078741346

…ndagma-fsup.sh to work with the pandagma driver script and pandagma-common.sh
@StevenCannon-USDA
Copy link
Contributor Author

I think I've fixed at least those config omissions.

@nathanweeks
Copy link
Contributor

From the results of the last workflow run:
https://github.com/legumeinfo/pandagma/actions/runs/8087305867

The new default workflow steps in pandagma pan (align_cds, align_protein, model_and_trim, calc_trees xfr_aligns_trees) resulted in a >3X increase in runtime, causing the Medicago3_8_0_15 to exceed the GitHub Actions 6-hour job walltime limit. I'll remove that from the default list of configurations that are tested.

The Glycine_7_3_2 and Vigna4_7_1_4 jobs completed, but encountered an OS "too many open files" error when running actions/upload-artifact; e.g.:

With the provided path, there will be 152583 files uploaded
Artifact name is valid!
Root directory input is valid!
Beginning upload of artifact content to blob storage
node:events:492
      throw er; // Unhandled 'error' event
      ^

Error: EMFILE: too many open files, open '/home/runner/work/pandagma/pandagma/upload-artifact/pandagma-pan-Glycine_7_3_2/22_hmmalign/pan22656'
Emitted 'error' event on ReadStream instance at:
    at emitErrorNT (node:internal/streams/destroy:151:8)
    at emitErrorCloseNT (node:internal/streams/destroy:[11](https://github.com/legumeinfo/pandagma/actions/runs/8087305867/job/22098949476#step:8:12)6:3)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
  errno: -24,
  code: 'EMFILE',
  syscall: 'open',
  path: '/home/runner/work/pandagma/pandagma/upload-artifact/pandagma-pan-Glycine_7_3_2/22_hmmalign/pan22656'
}

Node.js v[20](https://github.com/legumeinfo/pandagma/actions/runs/8087305867/job/22098949476#step:8:21).8.1

The previous successful run of Glycine_7_3_2 resulted in only 23 files in the out_pandagma directory (instead of 152583 files).

The extra files are in the 21_hmm, 22_hmmalign, 23_hmmalign_trim2, and 24_trees directories. Are these needed in the out_pandagma for pandagma pan, or only in work_pandagma?

@StevenCannon-USDA
Copy link
Contributor Author

StevenCannon-USDA commented Feb 29, 2024

For the default workflows, I think we should NOT include these steps:

pan: align_cds, align_protein, model_and_trim, calc_trees, xfr_aligns_trees
fam: align_protein, model_and_trim, calc_trees, xfr_aligns_trees

Omitting these should fix both the runtime and file-count problems.

What determines which steps are included in "default" in the GitHub actions?

@nathanweeks
Copy link
Contributor

nathanweeks commented Feb 29, 2024

What determines which steps are included in "default" in the GitHub actions?

The GitHub Actions workflow is running the default sequence of pandagma workflow steps run when the pandagma pan -s <STEP> option is omitted:

env PATH=$PWD/bin:$PATH time -v conda run -n pandagma --no-capture-output pandagma pan -d data_TEfilter -c "config/${PANDAGMA_CONF}.conf" -w /work -r -o ${PANDAGMA_OUT} 2>&1 | tee ${PANDAGMA_OUT}/pandagma.log

@StevenCannon-USDA
Copy link
Contributor Author

StevenCannon-USDA commented Feb 29, 2024

The GitHub Actions workflow is running the default sequence of pandagma workflow steps run when the pandagma pan -s option is omitted

OK. I'll pull the optional steps out of -pan and -fam (just removing them from commandlist) and do some more testing. I'll convert this PR to draft until then.

@StevenCannon-USDA StevenCannon-USDA marked this pull request as draft February 29, 2024 13:12
@StevenCannon-USDA StevenCannon-USDA marked this pull request as ready for review February 29, 2024 21:54
@StevenCannon-USDA
Copy link
Contributor Author

@nathanweeks Ready for review again. I tested the fam workflow with family_7_3.conf, the pan workflow with Vigna4_7_1_4.conf, and the fsup workflow with family3_sup.conf and family3_sup_chafa.conf (but I would say don't bother testing fsup; I think I've got that one - and it is still peripheral.)

@nathanweeks
Copy link
Contributor

GA workflow passed.

@nathanweeks nathanweeks merged commit 030b8ad into main Mar 1, 2024
5 checks passed
@nathanweeks nathanweeks deleted the min_align_count branch March 1, 2024 21:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants